Conference Proceedings

ClusiVAT: A mixed visual/numerical clustering algorithm for big data

D Kumar, M Palaniswami, S Rajasegarar, C Leckie, JC Bezdek, TC Havens

Proceedings of the 2013 IEEE International Conference on Big Data (Big Data 2013) | Published: 2013

Abstract

Recent algorithmic and computational improvements have reduced the time it takes to build a minimal spanning tree (MST) for big data sets. In this paper we compare single linkage clustering based on MSTs built with the Filter-Kruskal method to the proposed clusiVAT algorithm, which is based on sampling the data, imaging the sample to estimate the number of clusters, followed by non-iterative extension of the labels to the rest of the big data with the nearest prototype rule. Numerical experiments with both synthetic and real data confirm the theory that clusiVAT produces true single linkage clusters in compact, separated data. We also show that single linkage fails, while clusiVAT finds high..

Grants

Awarded by ARC

Awarded by Equipment and Facilities scheme (LIEF)

Awarded by Australian Research Council


Funding Acknowledgements

We acknowledge the support of the Australian Research Council (ARC) Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), the ARC Linkage project grant (LP120100529), and the ARC Linkage Infrastructure, Equipment and Facilities scheme (LIEF) grant (LF120100129).